support prompt caching token tracking in langchain #1250
Conversation
Extract `cache_read` and `cache_creation` from LangChain's nested `input_token_details` object and map them to Braintrust's standard metric names (`prompt_cached_tokens`, `prompt_cache_creation_tokens`). This enables accurate cache token tracking in the UI and correct cost calculations for cached prompts.

Fixes Pylon #10400

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
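As a rough sketch of the mapping described above (the function name and structure here are illustrative, not the actual handler code):

```python
# Illustrative sketch of the described mapping; the real braintrust-langchain
# handler may differ in structure and naming.
from typing import Any, Dict


def extract_cache_metrics(usage_metadata: Dict[str, Any]) -> Dict[str, int]:
    """Map LangChain's nested input_token_details onto Braintrust metric names."""
    metrics: Dict[str, int] = {}
    details = usage_metadata.get("input_token_details") or {}
    if "cache_read" in details:
        metrics["prompt_cached_tokens"] = details["cache_read"]
    if "cache_creation" in details:
        metrics["prompt_cache_creation_tokens"] = details["cache_creation"]
    return metrics
```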
Added a focused unit test for cache-token extraction in langchain-py.
Commit: 08205f4
cpinn left a comment:
This change looks great to me with the added test; it's what I was going to look into today. Usually for changes like this we also record and save a cassette so the tests run a little faster.
cc @ibolmo: should we add a cassette to this change, or should we merge as is and add it later?
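For context, a cassette records real HTTP responses once and replays them on later runs. The repo's exact setup isn't shown in this thread; as a generic illustration with vcrpy (the cassette path and test below are hypothetical):

```python
# Generic illustration of cassette-based testing with vcrpy; the repo's
# actual fixture layout and recording helper may differ.
import requests
import vcr


@vcr.use_cassette("tests/cassettes/example.yaml")
def test_request_is_replayed_from_cassette():
    # First run hits the network and records the response into the cassette;
    # later runs replay it, so the test is fast and deterministic.
    resp = requests.get("https://example.com")
    assert resp.status_code == 200
```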
Force-pushed fix/langchain-cache-tokens: 67e1107 to 8542310, then 8542310 to 546383a.
assert "prompt_cache_creation_tokens" in first_metrics
assert first_metrics["prompt_cache_creation_tokens"] > 0
if "prompt_cached_tokens" in first_metrics:
nit: do we want this if statement in the test? With it, the assertion may or may not run.
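One way to address the nit, assuming the recorded response always reports both cache metrics, would be to assert both keys unconditionally:

```python
# Possible unconditional form of the assertions (assumes the response
# always includes both cache metrics, so neither check is silently skipped).
assert "prompt_cache_creation_tokens" in first_metrics
assert first_metrics["prompt_cache_creation_tokens"] > 0
assert "prompt_cached_tokens" in first_metrics
assert first_metrics["prompt_cached_tokens"] >= 0
```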
braintrust-langchain 0.2.1 and @braintrust/langchain-js 0.2.3 are both out.
Problem

The LangChain callback handler only extracts the top-level fields from `usage_metadata` (`input_tokens`, `output_tokens`, `total_tokens`), but not the nested `input_token_details` containing cache metrics. As a result, cache tokens never show up in the UI and cost calculations for cached prompts are incorrect.
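For reference, LangChain surfaces these values roughly like this (the numbers are made up for illustration):

```python
# Example shape of LangChain usage_metadata for a cached prompt (values are
# illustrative). The cache metrics live in the nested input_token_details
# object, which the handler previously ignored.
usage_metadata = {
    "input_tokens": 1250,
    "output_tokens": 180,
    "total_tokens": 1430,
    "input_token_details": {
        "cache_read": 1024,   # -> prompt_cached_tokens
        "cache_creation": 0,  # -> prompt_cache_creation_tokens
    },
}
```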